{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lab 2 - Clustering with GraphX\n",
    "In this lab we use GraphX to cluster music songs according to the tags attached to each song. The datasets that we are working is available under the folder `data`, and also can be downloaded from [here](http://www.cs.cornell.edu/~shuochen/lme/data_page.html). The datasets consist of the following files:\n",
    "* `song_hash.txt`: this file maps a song ID to its title and artist.\n",
    "  + Each line corresponds to one song, and has the format called `Integer_ID\\tTitle\\tArtist\\n`.\n",
    "* `tag_hash.txt`: this one maps a tag ID to its name.\n",
    "  + Each line corresponds to one song, and has the format called `Integer_ID, Name\\n`.\n",
    "* `tags.txt`: this file includes the social tags by using the integer ID to represent songs.\n",
    "  + The tag data file has the same number of lines as the total number of songs. Each line consists of the IDs of the tags for a song, represented by integers, and separated by space. If a song does not have a tag, its line is just a #. Note that for the tag file, there is no space at the end of each line.\n",
    "\n",
    "### Load the data into a Spark graph property\n",
    "Let's start by loading the data into a Spark graph property. To do this, wee define a case class `Song`, with three attributes: `title`, `artist`, and a set of `tags`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Intitializing Scala interpreter ..."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "Spark Web UI available at http://a05cbddd16bc:4040\n",
       "SparkContext available as 'sc' (version = 2.4.4, master = local[*], app id = local-1570381644485)\n",
       "SparkSession available as 'spark'\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "import scala.io.Source\n",
       "import org.apache.spark.rdd.RDD\n",
       "import org.apache.spark.graphx._\n",
       "import org.apache.spark.mllib.clustering.PowerIterationClustering\n",
       "defined class Song\n"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import scala.io.Source\n",
    "\n",
    "import org.apache.spark.rdd.RDD\n",
    "import org.apache.spark.graphx._\n",
    "import org.apache.spark.mllib.clustering.PowerIterationClustering\n",
    "\n",
    "case class Song(title: String, artist: String, tags: Set[String])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, import the songs from the dataset file `song_hash.txt` into `RDD[(VertexId, Song)]`, called `songs`, and initialize each song with an empty set of tags. You should see the following lines, if you run `songs.take(5).foreach(println)`:\n",
    "```\n",
    "(0,Song(Gucci Time (w\\/ Swizz Beatz),Gucci Mane,Set()))\n",
    "(1,Song(Aston Martin Music (w\\/ Drake & Chrisette Michelle),Rick Ross,Set()))\n",
    "(2,Song(Get Back Up (w\\/ Chris Brown),T.I.,Set()))\n",
    "(3,Song(Hot Toddy (w\\/ Jay-Z & Ester Dean),Usher,Set()))\n",
    "(4,Song(Whip My Hair,Willow,Set()))\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(0,Song(Gucci Time (w\\/ Swizz Beatz),Gucci Mane,Set()))\n",
      "(1,Song(Aston Martin Music (w\\/ Drake & Chrisette Michelle),Rick Ross,Set()))\n",
      "(2,Song(Get Back Up (w\\/ Chris Brown),T.I.,Set()))\n",
      "(3,Song(Hot Toddy (w\\/ Jay-Z & Ester Dean),Usher,Set()))\n",
      "(4,Song(Whip My Hair,Willow,Set()))\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "fileName: String = data/song_hash.txt\n",
       "songs: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, Song)] = MapPartitionsRDD[3] at map at <console>:35\n"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val fileName = \"data/song_hash.txt\"\n",
    "\n",
    "var songs: RDD[(VertexId, Song)] = sc.textFile(fileName).map(_.split(\"\\t\")).map(line => (line(0).trim.toLong, new Song(line(1), line(2), Set())))\n",
    "\n",
    "songs.take(5).foreach(println)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, create a graph property called `graphFromSongs`, whose nodes are the songs, i.e., `Graph[Song, Int]`. It will not add any edges into the graph at first. If you print the vertices of the `graphFromSongs` by calling `graphFromSongs.vertices.take(5).foreach(println)`, you should see something as below:\n",
    "```\n",
    "(1084,Song(Tequila Sunrise,Fiji,Set()))\n",
    "(1410,Song(The Sweetest Taboo,Sade,Set()))\n",
    "(3066,Song(Bow Chicka Wow Wow,Mike Posner,Set()))\n",
    "(1894,Song(Love Your Love The Most,Eric Church,Set()))\n",
    "(466,Song(Stupify,Disturbed,Set()))\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1084,Song(Tequila Sunrise,Fiji,Set()))\n",
      "(1410,Song(The Sweetest Taboo,Sade,Set()))\n",
      "(3066,Song(Bow Chicka Wow Wow,Mike Posner,Set()))\n",
      "(1894,Song(Love Your Love The Most,Eric Church,Set()))\n",
      "(466,Song(Stupify,Disturbed,Set()))\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "zeroEdge: org.apache.spark.rdd.RDD[org.apache.spark.graphx.Edge[Int]] = ParallelCollectionRDD[4] at parallelize at <console>:35\n",
       "graphFromSongs: org.apache.spark.graphx.Graph[Song,Int] = org.apache.spark.graphx.impl.GraphImpl@8323d7d\n"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val zeroEdge: RDD[Edge[Int]] = sc.parallelize(Nil)\n",
    "val graphFromSongs: Graph[Song, Int] = Graph(songs,zeroEdge)\n",
    "\n",
    "graphFromSongs.vertices.take(5).foreach(println)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Extract the features of nodes\n",
    "Now, let's join the tags from the dataset called `tags.txt` into the nodes. To do this, we first need to create `tagRDD`, which is in form of `RDD[(VertexId, Set[String])]`. We hen join it with `graphFromSong`. For now, we have only the mapping between the song ID and the set of tag IDs in our `tagRDD`. If we print the first five items of the `tagRDD` we should see the following lines:\n",
    "```\n",
    "(0,Set(115, 173))\n",
    "(1,Set(62, 88, 110, 90, 123, 155, 173, 14, 190, 214, 115, 27))\n",
    "(2,Set(115, 173))\n",
    "(3,Set(2, 72, 173))\n",
    "(4,Set(62, 88, 141, 107, 24, 155, 72, 6, 126, 173, 190, 115, 2, 52))\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tagFile: String = data/tags.txt\n",
       "tagIter: Iterator[(org.apache.spark.graphx.VertexId, Set[String])] = non-empty iterator\n",
       "tagRDD: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, Set[String])] = ParallelCollectionRDD[17] at parallelize at <console>:38\n"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val tagFile = \"data/tags.txt\"\n",
    "\n",
    "val tagIter: Iterator[(VertexId, Set[String])] = Source.fromFile(tagFile).getLines.zipWithIndex.map {\n",
    "    x => val tags = x._1 split \" \" \n",
    "    (x._2.toLong, tags.toSet)\n",
    "}\n",
    "\n",
    "val tagRDD = sc.parallelize(tagIter.toSeq)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What we want is to extract the tag names from `tag_hash.txt` given the tag ID. We can now call `joinVertices` on `graphFromSongs`, and pass the RDD of tags `tagRDD` with a function that extracts the tags. Note that in the dataset called `tags.txt`, a `#` assigned next to the song ID means that no tag is associated with that song. In such a case, we simply return the initial song with an empty tag, otherwise, we add the set of tags into the song. Below, you can see an example output if you run `songsNtags.vertices.take(5).foreach(println)`:\n",
    "```\n",
    "(1084,Song(Tequila Sunrise,Fiji,Set()))\n",
    "(1410,Song(The Sweetest Taboo,Sade,Set(beautiful, 1980s, 80's, rock, sexy, favorite, female vocalist, smooth, female, pop, best, 80s, soul, easy listening, classic, chillout, urban, lounge, romantic, soft, singer-songwriter, lovely, chill, dance, love songs, mellow, love, jazz, female vocals, british, female vocalists, rnb, smooth jazz, favorites, ballad, relax, adult contemporary, great, melancholy, relaxing)))\n",
    "(3066,Song(Bow Chicka Wow Wow,Mike Posner,Set(sexy, smooth, male vocalists, pop, <3, hip-hop, love song, wdzh-fm, awesome, easy listening, love at first listen, r&b, electronic, love songs, love, american, 2010s, nostalgic, rnb, favorites, wkqi-fm)))\n",
    "(1894,Song(Love Your Love The Most,Eric Church,Set(beautiful, makes me smile, fav, pop, modern country, country, great song, love songs, favourite songs, usa, love, favorites, male country, my favorite)))\n",
    "(466,Song(Stupify,Disturbed,Set(punk rock, rock, 90s, alternative metal, favorite, favourite, favorite songs, good, nu metal, 00s, angry, progressive rock, awesome, male vocalist, hardcore, sex, alternative, favs, heavy, 2000s, heavy metal, classic rock, hard rock, alternative rock, american, female vocalists, catchy, favorites, metal, my music, nu-metal, aitch)))\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1084,Song(Tequila Sunrise,Fiji,Set()))\n",
      "(1410,Song(The Sweetest Taboo,Sade,Set(beautiful, 1980s, 80's, rock, sexy, favorite, female vocalist, smooth, female, pop, best, 80s, soul, easy listening, classic, chillout, urban, lounge, romantic, soft, singer-songwriter, lovely, chill, dance, love songs, mellow, love, jazz, female vocals, british, female vocalists, rnb, smooth jazz, favorites, ballad, relax, adult contemporary, great, melancholy, relaxing)))\n",
      "(3066,Song(Bow Chicka Wow Wow,Mike Posner,Set(sexy, smooth, male vocalists, pop, <3, hip-hop, love song, wdzh-fm, awesome, easy listening, love at first listen, r&b, electronic, love songs, love, american, 2010s, nostalgic, rnb, favorites, wkqi-fm)))\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "songsNtags: org.apache.spark.graphx.Graph[Song,Int] = org.apache.spark.graphx.impl.GraphImpl@70474515\n"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val songsNtags = graphFromSongs.joinVertices(tagRDD) {\n",
    "  (id, s, ks) => ks.toList match {\n",
    "    case List(\"#\") => s\n",
    "    case _ => {\n",
    "      val taghashFile = \"data/tag_hash.txt\"\n",
    "      val tags: Map[Int, String] = Source.fromFile(taghashFile).getLines().map {\n",
    "          line => val row = line split \", \"\n",
    "          row(0).toInt -> row(1)\n",
    "      }.toMap\n",
    "      val songTags = ks.map(_.toInt).flatMap(tags.get)\n",
    "      Song(s.title, s.artist, songTags.toSet)\n",
    "    }\n",
    "  }\n",
    "}\n",
    "\n",
    "songsNtags.vertices.take(3).foreach(println)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define a similarity measure between two nodes\n",
    "Now, let's measure the similarity between two songs using the *Jaccard metric*, which is the ratio of the number of common tags between two songs, and their total number of tags. If none of the songs is tagged, we assume that their similarity score is zero. For this, we define the `similarity` method as below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "// def similarity(one: Song, other: Song): Double = {\n",
    "//   val numCommonTags = (one.tags intersect other.tags).size\n",
    "//   val numTotalTags = (one.tags union other.tags).size\n",
    "//   if (numTotalTags > 0)\n",
    "//     numCommonTags.toDouble / numTotalTags.toDouble\n",
    "//   else \n",
    "//     0.0\n",
    "// }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating an affinity matrix\n",
    "Now, we need to calculate the similarity between each pair of songs in our database. If there are 1,000 songs, we will have to compute, and store, one million similarity scores. What if we had 1,000,000 songs? Obviously, computing similarities between every pair will be inefficient. Instead, we can restrict this to the songs that have a relatively high similarity score. At the end of the day, we want to cluster songs that are similar. Therefore, we will filter the nodes with the following function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "// def quiteSimilar(one: Song, other: Song, threshold: Double): Boolean = {\n",
    "//   val commonTags = one.tags.intersect(other.tags)\n",
    "//   val combinedTags = one.tags.union(other.tags)\n",
    "//   commonTags.size > combinedTags.size * threshold\n",
    "// }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define a function to remove the duplicate songs in our graph data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "//def differentSong(one: Song, other: Song): Boolean = one.title != other.title || one.artist != other.artist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With these two functions, we can now create `RDD[Edge[Double]]` that will contain a similarity measure between the nodes that are quite similar. A simple check `similarConnections.count` shows that we only need to store 1,506 similarity scores instead of 10 million."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "similarity: (one: Song, other: Song)Double\n",
       "quiteSimilar: (one: Song, other: Song, threshold: Double)Boolean\n",
       "differentSong: (one: Song, other: Song)Boolean\n",
       "songs: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, Song)] = VertexRDDImpl[20] at RDD at VertexRDD.scala:57\n",
       "similarConnections: org.apache.spark.rdd.RDD[org.apache.spark.graphx.Edge[Double]] = MapPartitionsRDD[29] at map at <console>:71\n"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// I copied the functions in the same cell because otherwise there were errors\n",
    "def similarity(one: Song, other: Song): Double = {\n",
    "  val numCommonTags = (one.tags intersect other.tags).size\n",
    "  val numTotalTags = (one.tags union other.tags).size\n",
    "  if (numTotalTags > 0)\n",
    "    numCommonTags.toDouble / numTotalTags.toDouble\n",
    "  else \n",
    "    0.0\n",
    "}\n",
    "\n",
    "def quiteSimilar(one: Song, other: Song, threshold: Double): Boolean = {\n",
    "  val commonTags = one.tags.intersect(other.tags)\n",
    "  val combinedTags = one.tags.union(other.tags)\n",
    "  commonTags.size > combinedTags.size * threshold\n",
    "}\n",
    "\n",
    "def differentSong(one: Song, other: Song): Boolean = one.title != other.title || one.artist != other.artist\n",
    "\n",
    "// First, get the songs with tags\n",
    "songs = songsNtags.vertices\n",
    "\n",
    "// Then, compute the similarity between each pair of songs with a similarity score larger than 0.7\n",
    "val similarConnections: RDD[Edge[Double]] = {\n",
    "  val ss = songs.cartesian(songs)\n",
    "   \n",
    "  // similarSongs are the songs with a similarity score larger than 0.7\n",
    "  // Note that not compare a song with itself (use the differentSong method)\n",
    "  val similarSongs = ss.filter { p => \n",
    "          ((p._1._1 != p._2._1) && \n",
    "          quiteSimilar(p._1._2, p._2._2, 0.7) && \n",
    "          differentSong(p._1._2, p._2._2))\n",
    "  }\n",
    "    \n",
    "  // Now compute the Jaccard metric for the similarSongs. The result should ba an Edge for each pair of songs\n",
    "  // with the vertexIds at the two ends, and the Javvard values as the edge value.\n",
    "  similarSongs.map { p => {\n",
    "    val jacIdx = similarity(p._1._2, p._2._2)\n",
    "    Edge(p._1._1, p._2._1, jacIdx) }\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create our similarity graph `similarByTagsGraph`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "similarByTagsGraph: org.apache.spark.graphx.Graph[Song,Double] = org.apache.spark.graphx.impl.GraphImpl@4bb0b917\n"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val similarByTagsGraph = Graph(songs, similarConnections)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some of our songs have very few tags, so let's filter those out and keep the songs with more than five tags."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "similarHighTagsGraph: org.apache.spark.graphx.Graph[Song,Double] = org.apache.spark.graphx.impl.GraphImpl@66e4a795\n"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val similarHighTagsGraph = similarByTagsGraph.subgraph(vpred = (id: VertexId, attr: Song) => attr.tags.size > 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look closer into the graph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Song(Fancy (w\\/ T.I. & Swizz Beatz),Drake,Set(rap, wdzh-fm, 2010, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) ~~~ Song(Any Girl (w\\/ Lloyd),Lloyd Banks,Set(rap, wdzh-fm, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) => 0.8571428571428571\n",
      "Song(Roll With It,Easton Corbin,Set(new country, modern country, country, great song, 2010, my favorite)) ~~~ Song(You Lie,The Band Perry,Set(new country, modern country, country, great song, female vocalists, my favorite)) => 0.7142857142857143\n",
      "Song(Any Girl (w\\/ Lloyd),Lloyd Banks,Set(rap, wdzh-fm, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) ~~~ Song(Fancy (w\\/ T.I. & Swizz Beatz),Drake,Set(rap, wdzh-fm, 2010, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) => 0.8571428571428571\n",
      "Song(Any Girl (w\\/ Lloyd),Lloyd Banks,Set(rap, wdzh-fm, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) ~~~ Song(I'm Going In (w\\/ Young Jeezy & Lil Wayne),Drake,Set(rap, hip-hop, wdzh-fm, wjlb-fm, whtd-fm, wkqi-fm)) => 0.7142857142857143\n",
      "Song(Everything Falls,Fee,Set(rock, worship, christian, contemporary christian, good stuff, christian rock)) ~~~ Song(Needful Hands,Jars Of Clay,Set(rock, worship, christian, contemporary christian, christian rock, favorites)) => 0.7142857142857143\n"
     ]
    }
   ],
   "source": [
    "similarHighTagsGraph.triplets.take(5).foreach(t => println(t.srcAttr + \" ~~~ \" + t.dstAttr + \" => \" + t.attr))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should see the following output:\n",
    "```\n",
    "Song(Fancy (w\\/ T.I. & Swizz Beatz),Drake,Set(rap, wdzh-fm, 2010, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) ~~~ Song(Any Girl (w\\/ Lloyd),Lloyd Banks,Set(rap, wdzh-fm, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) => 0.8571428571428571\n",
    "Song(Roll With It,Easton Corbin,Set(new country, modern country, country, great song, 2010, my favorite)) ~~~ Song(You Lie,The Band Perry,Set(new country, modern country, country, great song, female vocalists, my favorite)) => 0.7142857142857143\n",
    "Song(Any Girl (w\\/ Lloyd),Lloyd Banks,Set(rap, wdzh-fm, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) ~~~ Song(Fancy (w\\/ T.I. & Swizz Beatz),Drake,Set(rap, wdzh-fm, 2010, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) => 0.8571428571428571\n",
    "Song(Any Girl (w\\/ Lloyd),Lloyd Banks,Set(rap, wdzh-fm, hip hop, wjlb-fm, whtd-fm, wkqi-fm)) ~~~ Song(I'm Going In (w\\/ Young Jeezy & Lil Wayne),Drake,Set(rap, hip-hop, wdzh-fm, wjlb-fm, whtd-fm, wkqi-fm)) => 0.7142857142857143\n",
    "Song(Everything Falls,Fee,Set(rock, worship, christian, contemporary christian, good stuff, christian rock)) ~~~ Song(Needful Hands,Jars Of Clay,Set(rock, worship, christian, contemporary christian, christian rock, favorites)) => 0.7142857142857143\n",
    "Song(Bring The Rain,MercyMe,Set(rock, worship, christian, contemporary christian, uplifting, christian rock, powerful, favorites)) ~~~ Song(Needful Hands,Jars Of Clay,Set(rock, worship, christian, contemporary christian, christian rock, favorites)) => 0.75\n",
    "```\n",
    "We can see that \"Fancy\" by \"Drake\" is similar to \"Any Girl\" by \"Lloyd Banks\". Of course, they are rap songs. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "spylon-kernel",
   "language": "scala",
   "name": "spylon-kernel"
  },
  "language_info": {
   "codemirror_mode": "text/x-scala",
   "file_extension": ".scala",
   "help_links": [
    {
     "text": "MetaKernel Magics",
     "url": "https://metakernel.readthedocs.io/en/latest/source/README.html"
    }
   ],
   "mimetype": "text/x-scala",
   "name": "scala",
   "pygments_lexer": "scala",
   "version": "0.4.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
